DETAILED ACTION

1. This action is responsive to an RCE to application 09/669,680 filed on 2/14/2005.

2. Claims 1, 3-8, 10-23, and 25-29 are pending in the case. Claims 1, 8, 15, 20, and 23 are
independent claims. Claims 2, 9, and 24 have been cancelled. Claim 30 remains 
cancelled. Claims 1, 8, 15, 20, and 23 have been amended. 

Claim Rejections - 35 USC § 112 
The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the 
subject matter which the applicant regards as his invention. 

3. Claims 1, 8, 15, 20, and 23 refer to a "similar" clustering in the bottom three lines of the 
claims; such reference is indefinite and a more precise elaboration of what is meant by 
the term similar is required. Similarly, on the next line, "new, but related" is indefinite; 
the claim does not specify how the dataset is related. 

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

4. Claims 1-30 are rejected under 35 U.S.C. 103(a) as being unpatentable over Lantrip 
et al. (USPN 6,298,174 B1, filing date 10/15/1999), hereinafter Lantrip, further in
view of Ruocco et al. (USPN 5,864,855, filing date 2/26/1996), hereinafter Ruocco.



5. Regarding independent claim 1, Lantrip discloses a method of clustering documents in
datasets (in col. 2, lines 39-42, document vectors are arranged into clusters) comprising: 
clustering first documents in a first dataset to produce first document classes; (in col. 2, 
lines 39-42, document vectors are arranged into clusters), and creating centroid seeds 
based on said first document classes (in col. 2, lines 43-45, the invention finds centroids). 
However, Lantrip fails to disclose clustering second documents in a second dataset using 
said centroid seeds. However, in col. 14, lines 10-45, Ruocco discloses, in the claim,
processing second datasets in parallel based on cluster information from previous
cluster vectors (see col. 14, lines 28-30) in order to gain the benefit of information from
previous clusters to improve analysis of subsequent datasets. Ruocco's invention further
may be interpreted such that said second dataset has a similar clustering to that of said
first dataset (as the term "similar" is sufficiently broad that any two given datasets would
have some degree of similarity; see the 35 U.S.C. 112 rejection above), further wherein said
second dataset comprises a new, but related dataset different than said first dataset (once
the first dataset is transformed, it is by definition a new, but related dataset). It would
have been obvious to one of ordinary skill in the art at the time of the invention to use the 
information contained in the centroid seeds from Lantrip for subsequent datasets as in 
Ruocco in order to improve analysis of subsequent datasets. 
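
For illustration only, the step at issue in claim 1 (clustering second documents using centroid seeds obtained from a first dataset) may be sketched as a k-means pass initialized from those seeds. This sketch is not taken from Lantrip, Ruocco, or the application; the choice of k-means, the function names, and the sample vectors are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def kmeans_from_seeds(vectors, seeds, iterations=10):
    """Cluster `vectors`, starting from the supplied seeds (hypothetical helper)."""
    centroids = [list(s) for s in seeds]
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # an empty class keeps its previous centroid
                centroids[k] = mean_vector(members)
    # Final assignment against the final centroids.
    labels = [nearest(v, centroids) for v in vectors]
    return labels, centroids


# Assumed seeds carried over from a first dataset, and assumed second-dataset vectors.
seeds = [[0.0, 1.0], [10.0, 11.0]]
second_vectors = [[1, 1], [9, 9], [11, 13], [-1, 0]]
labels, centroids = kmeans_from_seeds(second_vectors, seeds)
print(labels)
print(centroids)
```

Because the second pass starts from the first dataset's centroids, class index k in the second dataset corresponds to class index k in the first, which is one way the two clusterings could be said to be "similar".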

6. Regarding dependent claim 2, Lantrip and Ruocco fail to disclose that said first dataset 
and said second dataset are related. However, it was notoriously well known in the art at 
the time of the invention that if one intends to process a dataset based on the results of 
previously processing another dataset, the datasets should be related in order for the 
results to be meaningful. It would have been obvious to one of ordinary skill in the art at
the time of the invention to have the first and second dataset be related in order for the 
results to be meaningful. 

7. Regarding dependent claim 3, Lantrip discloses that the clustering of said first 
documents in said first dataset comprises: forming a first dictionary of most common 
words in said first dataset (in col. 2, lines 30-40, Lantrip creates a database based on the 
dataset, which would include the common words); generating a first vector space model 
by counting, for each word in said first dictionary, a number of said first documents in 
which said word occurs (in col. 2, lines 35-42, Lantrip creates a vector space model); and 
clustering said first documents in said first dataset based on said first vector space model 
(in col. 2, lines 39-42, Lantrip carries out clustering). 
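
For illustration only, the dictionary and vector space model steps recited in claim 3 may be sketched as follows. The sketch reflects one possible reading of the claim language: the dictionary is taken to be the words occurring in the most documents, and each document is represented by its counts of those words. The whitespace tokenization, the function names, and the sample documents are assumptions and are not drawn from Lantrip or the application.

```python
from collections import Counter


def build_dictionary(docs, size):
    """Return the `size` words that occur in the largest number of documents."""
    doc_freq = Counter()
    for doc in docs:
        # set() so each word is counted once per document
        doc_freq.update(set(doc.lower().split()))
    # Descending document frequency, then alphabetical for a deterministic order.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]


def vectorize(docs, dictionary):
    """Map each document to a vector of term counts over the dictionary."""
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([counts[word] for word in dictionary])
    return vectors


# Hypothetical first dataset.
first_docs = ["cat sat mat", "cat ate fish", "dog ate bone"]
dictionary = build_dictionary(first_docs, size=3)
vectors = vectorize(first_docs, dictionary)
print(dictionary)
print(vectors)
```

Calling `vectorize` on a second set of documents with the same `dictionary` would correspond to the second vector space model of claim 4, under the same assumptions.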

8. Regarding dependent claim 4, Lantrip fails to disclose a method further comprising 
generating a second vector space model by counting, for each word in said first 
dictionary, a number of said second document in which said word occurs. However, 
Ruocco, in col. 14, lines 20-35, discloses generating such a vector space model for 
multiple document sets in order to aid in the clustering analysis of the document sets. It 
would have been obvious to one of ordinary skill in the art at the time of the invention to 
generate a second vector space model in the manner of Ruocco in Lantrip's invention in
order to aid in the clustering analysis of the document sets. 

9. Regarding dependent claim 5, Lantrip discloses that said creating of said centroid seeds 
comprises: classifying said second vector space model using said first document classes 
to produce a classified second vector space model (col 2, lines 39-42, the vector space 
model is clustered); and determining a mean of vectors in each class in said classified
second vector space model, wherein said mean comprises said centroid seeds (col. 2, 
lines 43-45, the centroid is the center of mass of the clusters). 
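
For illustration only, the two steps recited in claim 5 (classifying the second vector space model using the first document classes, then taking the mean of the vectors in each class as the centroid seeds) may be sketched as follows. The claim does not name a classifier; nearest-centroid classification is assumed here, and the function names and sample values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(vectors, class_centroids):
    """Assign each vector to the nearest first-dataset class (assumed classifier)."""
    return [
        min(range(len(class_centroids)),
            key=lambda k: squared_distance(v, class_centroids[k]))
        for v in vectors
    ]


def class_means(vectors, labels):
    """Mean vector of each non-empty class: the centroid seeds."""
    grouped = {}
    for v, lab in zip(vectors, labels):
        grouped.setdefault(lab, []).append(v)
    return {
        lab: [sum(col) / len(members) for col in zip(*members)]
        for lab, members in grouped.items()
    }


# Assumed centroids of the first document classes and assumed second-dataset vectors.
first_class_centroids = [[0.0, 1.0], [10.0, 11.0]]
second_vectors = [[1, 1], [9, 9], [11, 13], [-1, 0]]

labels = classify(second_vectors, first_class_centroids)
seeds = class_means(second_vectors, labels)
print(labels)
print(seeds)
```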

10. Regarding dependent claim 6, Lantrip and Ruocco fail to disclose a method further 
comprising forming a second dictionary of most common words in said second dataset; 
generating a third vector space model by counting, for each word in said second 
dictionary, a number of said second documents in which said word occurs; and clustering 
said documents in said second dataset based on said third vector space model to produce 
a second dataset cluster. However, this constitutes simply extending and repeating claim 
3 to a third dataset, and it was notoriously well known in the art at the time of the 
invention that it is useful to repeat steps for multiple datasets to take advantage of their 
utility for subsequent data. It would have been obvious to one of ordinary skill in the art 
at the time of the invention to extend the steps of claim 3 to a subsequent dataset to gain 
the benefits of the analysis for that dataset. 

11. Regarding dependent claim 7, Lantrip discloses in col. 2, lines 39-45 that clustering of 
said documents in said dataset using said centroid seeds produces an adapted dataset 
cluster. However, Lantrip fails to disclose the use of multiple datasets and that the 
method further comprises comparing classes in said adapted dataset cluster to classes in 
said second dataset cluster; and adding classes to said adapted dataset cluster based on 
said comparing. However, in col. 4, lines 61-67, Ruocco deals with comparing multiple
dataset clusters in order to obtain more information about the relative status of the 
datasets. It would have been obvious to one of ordinary skill in the art at the time of the 
invention to compare multiple dataset clusters in order to obtain more information about
the relative status of the datasets. 
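
For illustration only, the comparing and adding steps recited in claim 7 may be sketched as follows. Neither the claim language quoted above nor the cited references specify how classes are compared; this sketch assumes each class is represented by its centroid and that a class from the second dataset cluster is added when it lies farther than a threshold from every class in the adapted dataset cluster. The function names, the threshold, and the sample centroids are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def add_novel_classes(adapted_centroids, fresh_centroids, threshold):
    """Return the adapted classes plus any fresh class with no close adapted match."""
    merged = [list(c) for c in adapted_centroids]
    for candidate in fresh_centroids:
        # Compare only against the original adapted classes, not newly added ones.
        closest = min(euclidean(candidate, c) for c in adapted_centroids)
        if closest > threshold:
            merged.append(list(candidate))
    return merged


# Assumed centroids: adapted cluster (seeded) and second dataset cluster (unseeded).
adapted = [[0.0, 0.5], [10.0, 11.0]]
fresh = [[0.5, 0.5], [10.0, 10.0], [30.0, 2.0]]
merged = add_novel_classes(adapted, fresh, threshold=5.0)
print(merged)
```

In this example only the third fresh class is far from both adapted classes, so it alone is added.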

12. Regarding independent claim 8, it is a system that carries out the method of claim 1,
and is rejected under similar rationale. 

13. Regarding dependent claim 10, it is a system that carries out the method of claim 3, and 
is rejected under similar rationale. 

14. Regarding dependent claim 11, it is a system that carries out the method of claim 4, and 
is rejected under similar rationale. 

15. Regarding dependent claim 12, it is a system that carries out the method of claim 5, and
is rejected under similar rationale. 

16. Regarding dependent claim 13, it is a system that carries out the method of claim 6, and 
is rejected under similar rationale. 

17. Regarding dependent claim 14, it is a system that carries out the method of claim 7, and 
is rejected under similar rationale. 

18. Regarding independent claim 15, it is essentially analogous to claim 1 except that it 
involves the steps of generating a vector space model of said second documents, which 
Ruocco presents in col. 14, lines 27-36, and classifying said vector space model of said 
second documents using said first document classes to produce a classified vector space 
model, which Ruocco presents in col. 14, lines 27-36. It would have been obvious to one 
of ordinary skill in the art at the time of the invention to use the Ruocco form of vector 
space analysis in addition to the Lantrip material from the rejection of Claim 1 in order to
enhance the classifications of the two datasets. The resulting combination would meet the
limitations of claim 15.

19. Regarding dependent claim 16, it is a method that modifies claim 15 in the same 
manner that claim 3 modifies claim 1 and is rejected under similar rationale. 

20. Regarding dependent claim 17, it is a method that modifies claim 15 in the same 
manner that claim 4 modifies claim 1 and is rejected under similar rationale. 

21. Regarding dependent claim 18, it is a method that modifies claim 15 in the same
manner that claim 6 modifies claim 1 and is rejected under similar rationale. 

22. Regarding dependent claim 19, it is a method that modifies claim 15 in the same 
manner that claim 7 modifies claim 1 and is rejected under similar rationale. 

23. Regarding independent claim 20, Lantrip discloses a method of clustering documents 
comprising: forming a first dictionary of most common words in a first dataset (col. 2, 
lines 30-35, Lantrip forms a first dictionary of common words); generating a first vector 
space model by counting, for each word in said first dictionary, a number of said first 
documents in which said word occurs (col. 2, lines 35-40, Lantrip forms vectors);
clustering said first documents in said first dataset based on said first vector space model
to produce first document classes (col. 2, lines 39-42, Lantrip forms clusters);
determining a mean of vectors in each class in said classified second vector space model
to produce centroid seeds (col. 2, lines 43-45, Lantrip forms centroid seeds); and
clustering documents in a second dataset using said centroid seeds (col. 2, lines 45-57,
Lantrip clusters using centroids). Lantrip fails to disclose generating a second vector
space model by counting, for each word in said first dictionary, a number of said
second documents in which said word occurs and classifying said second documents in
said second vector space model using said first document classes to produce a classified
second vector space model. However, col. 14, lines 28-36 of Ruocco indicate that vector
clustering analysis may involve multiple datasets in order to gain the benefit of 
information analysis from multiple sources. It would have been obvious to one of 
ordinary skill in the art at the time of the invention to have vector clustering analysis 
involve multiple datasets in order to gain the benefit of information analysis from 
multiple sources. 

24. Regarding dependent claim 21, it is a method that modifies claim 20 in the same 
manner that claim 6 modifies claim 1 and is rejected under similar rationale. 

25. Regarding dependent claim 22, it is a method that modifies claim 20 in the same 
manner that claim 7 modifies claim 1 and is rejected under similar rationale. 

26. Regarding independent claim 23, it is a program device embodying instruction to 
perform a method that is equivalent to Claim 1 and is rejected under similar rationale. 

27. Regarding dependent claim 25, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 3 and is rejected under similar rationale. 

28. Regarding dependent claim 26, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 4 and is rejected under similar rationale. 

29. Regarding dependent claim 27, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 5 and is rejected under similar rationale.

30. Regarding dependent claim 28, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 6 and is rejected under similar rationale.

31. Regarding dependent claim 29, it is a program device embodying instruction to perform
a method that is equivalent to Claim 7 and is rejected under similar rationale. 

Response to Amendment 

32. Applicant's arguments filed 1/13/2005 have been fully considered but they are not
persuasive. 

33. Applicant alleges that neither reference teaches a second dataset. However, as noted in 
the rejection, Ruocco teaches a dataset that goes through multiple stages, and those stages
constitute different datasets. When sets have distinct members, they are different sets. 

34. Applicant further alleges that the prior art does not teach or suggest clustering second 
documents in a second dataset using said centroid seeds, such that said second dataset has 
a similar clustering to that of said first dataset. The Examiner notes that "similar" is a 
vague and indefinite term (hence the rejection under 35 U.S.C. 112, second paragraph),
and hence the art is perfectly sufficient.

Conclusion 

The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. 

USPN 5,675,819 (filing date 6/16/1994), Schuetze

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jonathan D. Schlaifer whose telephone number is (571) 272-4129.
The examiner can normally be reached on 8:30-5:00, M-F.



Application/Control Number: 09/669,680
Art Unit: 2178

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Stephen Hong, can be reached on (571) 272-4124. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 




